Grounding of Word Meanings in Latent Dirichlet Allocation-Based Multimodal Concepts

Authors

  • Tomoaki Nakamura
  • Takaya Araki
  • Takayuki Nagai
  • Naoto Iwahashi
Abstract

In this paper, we propose a latent Dirichlet allocation (LDA)-based framework for multimodal categorization and word grounding by robots. The robot uses its physical embodiment to grasp and observe an object from various viewpoints, as well as to listen to the sound during the observation period. This multimodal information is used to categorize objects and form multimodal concepts using multimodal LDA. At the same time, the words acquired during the observation period are connected to the related concepts, which are represented by the multimodal LDA. We also provide a relevance measure that encodes the degree of connection between words and modalities. The proposed algorithm is implemented on a robot platform, and experiments are carried out to evaluate it. We also demonstrate a simple conversation between a user and the robot based on the learned model. © Koninklijke Brill NV, Leiden and The Robotics Society of Japan, 2011

Related resources

Learning Word Meanings and Grammar for Describing Everyday Activities in Smart Environments

If intelligent systems are to interact with humans in a natural manner, the ability to describe daily life activities is important. To achieve this, sensing human activities by capturing multimodal information is necessary. In this study, we consider a smart environment for sensing activities with respect to realistic scenarios. We next propose a sentence generation system from observed multimo...

Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots

In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information, e.g., vision, position, and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., "I am in my home" and "I am in front of the table," a hier...

Unsupervised Disambiguation of Image Captions

Given a set of images with related captions, our goal is to show how visual features can improve the accuracy of unsupervised word sense disambiguation when the textual context is very small, as this sort of data is common in news and social media. We extend previous work in unsupervised text-only disambiguation with methods that integrate text and images. We construct a corpus by using Amazon ...

SuMACC Project's Corpus - A Topic-Based Query Extension Approach to Retrieve Multimedia Documents

The SuMACC project aims to automatically track new multimodal entities on the Internet. The goal of the project is to propose robust multimedia methods that define relevant patterns, allowing these entities to be retrieved automatically. This paper describes the SuMACC corpus collected on video-sharing platforms using word-queries. Since concepts are limited to a single or few words, querying video-s...

Unsupervised Domain Tuning to Improve Word Sense Disambiguation

The topic of a document can prove to be useful information for Word Sense Disambiguation (WSD) since certain meanings tend to be associated with particular topics. This paper presents an LDA-based approach for WSD, which is trained using any available WSD system to establish a sense per (Latent Dirichlet allocation based) topic. The technique is tested using three unsupervised and one supervise...

Journal:
  • Advanced Robotics

Volume 25

Publication date: 2011